perm filename SCAN.BBM[UP,DOC]1 blob
sn#121266 filedate 1974-09-22 generic text, type C, neo UTF8
A few (hopefully) useful routines Bertrand Meyer, August 1974.
Unidentified: [Unintelligible].
("The Watergate Tapes", NEWSWEEK, 7/29/74).
These notes describe a package of input/output,
scanning, symbol-table searching and macro-processing routines,
written originally for HAL (the hand language), but general enough to
be used by any compiler written in SAIL.
1) GENERAL DESCRIPTION OF THE PACKAGE.
1.1 Where to find it.
1.2 What to read in this document if you just want to do
simple things.
1.3 List of the principal procedures.
2) SCANNING.
2.1 Action of the scanner.
2.2 The optional symbol table.
2.3 Character types.
2.4 Possible tokens
2.4.1 Single-character delimiters.
2.4.2 Constants.
2.4.3 Identifiers.
2.4.4 Two-character delimiters.
2.5 File switching
2.6 Adding types and reserved words.
2.7 Example.
3) MACRO-PROCESSING.
3.1 Basic.
3.1.1 Macro Definitions.
3.1.2 Simple Macro Calls.
3.2 Tricky.
3.3 Discussion; comparison with GPM.
4) DESCRIPTION OF THE PROCEDURES LEXAN and METALEXAN.
5) OTHER PROCEDURES.
5.1 PLEASE_ANSWER.
5.2 OPEN_LOOKUP_ENTER.
5.3 SEARCHINSERT.
5.4 BREAKTABLES.
5.5 INITSCAN.
5.6 ERROR.
5.7 PRINTABLE.
5.8 NEWRES.
5.9 NEWTYPE.
5.10 NEWBLOCK and LEAVEBLOCK.
6) USER'S GUIDE.
6.1 Requiring the routines.
6.2 Options.
6.3 "Debug" mode.
6.4 Test program.
APPENDICES:
1 Initial values for CHARCLASS.
2 Legal requirements.
3 Reserved words.
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 1) DESCRIPTION OF THE PACKAGE. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
1.1 Where to find it.
-------------------------
All the files listed below are on [CSP,SYS].
INIT[CSP,SYS] contains some declarations, and REQUIREs
the following LOAD_MODULES:
"IOREL[CSP,SYS]"
and
either SEARCH[CSP,SYS] (which itself REQUIREs HASH[CSP,SYS]),
or NOSEAR[CSP,SYS].
Two source files: ABBREV.SAI[CSP,SYS] and MACMAC.SAI[CSP,SYS]
are also REQUIREd by INIT.SAI.
REQUIRing INIT.SAI[CSP,SYS] as a SOURCE_FILE will give access
to all the procedures described below.
1.2 What to read in this document if you just want to do simple things.
---------------------------------------------------------------------------
If you just want to use a standard scanner and possibly a
simple macro facility, then read sections:
2 (scanning)
3.1 (simple macroprocessing), and
6 (user's guide).
1.3 List of the most important procedures.
----------------------------------------------
(Complete list is given in sections 4 and 5).
* PLEASE_ANSWER: Understands an answer typed in at a teletype
in most of the known Indo-European languages.
* LEXAN: reading from a specified program file, scans a
token and returns some information. Optionally, looks up and/or
inserts new id's into a bucket-hash symbol table (by calling
SEARCHINSERT; see below). The number of possible tokens, as well as
the "meaning" of all 128 ASCII characters, is initialized in a
standard way but can be changed easily by the calling program. LEXAN
also expands macros.
* SEARCHINSERT: Searches for an id in a bucket-hash symbol
table, inserts it if necessary, and returns a boolean value.
* INITSCAN: Initializes some variables for the scanner, opens
a file for scanning, calls BREAKTABLES, and stores some reserved
words into the symbol table.
* METALEXAN: This procedure is an "interface" between LEXAN
and the calling program. It is normally called in lieu of LEXAN.
METALEXAN calls LEXAN and traps a few special tokens, so that
macro-definitions and REQUIRE constructs (which allow the user to
switch files, or change the properties of individual characters) are
invisible to the calling program. A simple macro in INIT allows one
to get rid of METALEXAN.
* ERROR: prints a message and the scanning context.
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 2) SCANNING. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
2.1 Action of the scanner.
------------------------------
The scanner works on a file open on channel PROGCHAN
(INITSCAN will open it for you). It is written as a DO statement, and
will loop until a legal token is found; thus, comments and illegal
configurations never return a token to the calling program.
The scanner will read the input text, gobbling characters
called "separators" (usually blank, form feed, carriage return, etc.),
which will not appear as part of a token, until it finds a
"significant" character which signals the beginning of a token. It
will then evaluate that token and its "type".
The effect of one call to the scanner is essentially to give
new values to three global variables: SYMB, the string representation
of the token scanned; TOKEN, an integer which indicates the type of
this token; and (if the symbol-table module is used, and the scanner
has found an identifier), NEW_ID, a pointer to a symbol table entry.
2.2 The optional symbol table.
----------------------------------
Optionally, all tokens recognized by the scanner as
identifiers (exactly on what basis is explained in 2.3 and 2.4) will
be searched and/or inserted in a symbol table. The user can get rid
of this feature by erasing a line in INIT.SAI (see "user's guide",
section 6).
The symbol table uses bucket hashing. The hash function is:
(first letter (modulo 8)) * (last letter (modulo 8)). ***
A symbol table entry is a SAIL RECORD_POINTER of type ENTRI. An ENTRI
record has the following fields:
- INTEGER RTYPE {will contain a code for the type of the
identifier; it is this value which will be returned as TOKEN}.
- INTEGER VAL {intended for a value of some sort. The scanner
uses this field for identifiers which are macro names}.
- INTEGER BLOCKLEVEL {intended for use by a compiler for a
block-structured language. See functions NEWBLOCK and LEAVEBLOCK in
section 5.10}.
- STRING ITEMVAR NAME {the name of the identifier; 1000
NEW_ITEMS are allocated initially in INIT.SAI}. A STRING ITEMVAR is
used because SAIL does not currently have STRING components in
records. The name of the id. is DATUM of the above itemvar.
- RECORD_POINTER(ENTRI) LINK.
- RECORD_POINTER(ANY_CLASS) SEMANTICS, intended for whatever
other field(s) the user might need.
The hash function utilized is obviously not the best in the
world, although statistical tests conducted on big SAIL programs gave
quite honorable results. Replacing HASH.REL by your own file will
allow you to use the hash function of your dreams.
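The bucket-hash scheme of this section can be illustrated by the following minimal sketch, written in Python rather than SAIL. It is not the code of HASH or SEARCH. It assumes that "letter" means the ASCII code of the character, and that the bucket index is the product of the two codes taken modulo 8. The names Entry, bucket_index and search_insert, and the value chosen for TNONDECLARED, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

TNONDECLARED = -1   # placeholder; the real macro value is not given in these notes
TSANSTYPE = 0       # the notes state that TSANSTYPE has value 0


@dataclass
class Entry:
    """Rough analogue of an ENTRI record (field names follow the notes)."""
    name: str
    rtype: int = TSANSTYPE
    val: int = 0
    blocklevel: int = 0
    link: Optional["Entry"] = None   # next entry in the same bucket
    semantics: object = None


def bucket_index(ident: str) -> int:
    # Assumed reading of the hash: (first letter mod 8) * (last letter mod 8).
    return (ord(ident[0]) % 8) * (ord(ident[-1]) % 8)


# Products of two values in 0..7 lie in 0..49.
buckets = [None] * 50


def search_insert(ident: str, dont_insert: bool = False):
    """Return (found, entry); insert a new entry unless dont_insert is set."""
    i = bucket_index(ident)
    entry = buckets[i]
    while entry is not None:          # walk the chain of this bucket
        if entry.name == ident:
            return True, entry
        entry = entry.link
    if dont_insert:
        return False, None
    new = Entry(name=ident, link=buckets[i])
    buckets[i] = new                  # new entries go at the head of the chain
    return False, new
```

With this reading, only a fraction of the 50 bucket indices can ever be produced (a product of two numbers below 8), which is consistent with the remark that the function is not the best in the world.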
2.3 Character types
-----------------------
The scanner's decisions are based upon the values of an array
CHARCLASS[0:127], which gives the "class" of any ASCII character. The
possible "classes" are the following integer macros:
- LETTER.
- NUMBER.
- DELIMITER.
- DECIMALDOT.
- ILLEGAL.
- ENDFILE.
- OCTAL: An "octal" character warns that what's ahead should
be interpreted as an octal constant.
- QUOTE: Delimits a string. A string can contain a "quote"
character, which has to be repeated.
- SEPARATOR: usually blank, carriage return, form feed etc.
These are not significant characters, but will terminate an
identifier, a number, etc.
- CCOMMENT: Begins a comment; ENDCOMMENT: Ends a comment.
These are initialized to "{" and "}". Comments are ignored. Note that
comment delimiters don't nest, i.e. the comment is understood to be
finished with the first } encountered after a {. Normally, the
construct COMMENT.......<semicolon> will also be understood.
- MACROCALL ; ENDMACRO. (Used to delimit special macro calls;
see section 3.2).
- MACROBODYSTART ; MACROBODYEND. (Delimit macro bodies in
macro definitions; see section 3.1.1).
- DONTEVAL (see section 3.1.2).
In the rest of this document, "a NUMBER character", "a
SEPARATOR character", etc., mean "a character of CHARCLASS 'NUMBER'
", "a character of CHARCLASS 'SEPARATOR' ", etc.
INIT (the program in the "source_file" that must be REQUIREd
by the user's program) preloads CHARCLASS as shown in Appendix 1. If
METALEXAN (an optional interface between the scanner and the calling
program) is included, the scanner will recognize constructs of the
form:
REQUIRE "<char1> <char2>" COMMENT_DELIMITERS;
(or MACROBODY_DELIMITERS,
or MACRO_DELIMITERS). or
REQUIRE "<character>" ILLEGAL;
(or LETTER, or DELIMITER,
or SEPARATOR, or DONTEVAL,
or OCTAL).
which will reinitialize CHARCLASS. Several REQUIRE constructs can of
course be gathered, as in
REQUIRE "#&" COMMENT_DELIMITERS, "{}" MACRO_DELIMITERS, etc.
Note that CHARCLASS of a character is unique, so that a
REQUIRE giving a new "meaning" to a character will delete the old
one; also, the two elements of a pair of COMMENT_DELIMITERS, etc.,
must be different.
Since the scanner uses break tables, it is not enough to
change some values of CHARCLASS in order to give different "meanings"
to characters; one must then call the procedure BREAKTABLES, which
will reinitialize the breaktables. (This is exactly what the scanner
does when it sees REQUIRE constructs like those above).
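The relation between CHARCLASS and the break tables can be pictured with the sketch below (Python, not SAIL). It is only an illustration: the class codes, the default table and the function build_break_set are assumptions standing in for the real CHARCLASS values of Appendix 1 and for BREAKTABLES. The point shown is that data derived from the class table goes stale when the table is changed, and must be rebuilt.

```python
# Class codes are arbitrary here; the real macro values are not given in the notes.
LETTER, NUMBER, DELIMITER, SEPARATOR, ILLEGAL = range(5)


def default_charclass():
    """Build a 128-entry table loosely modelled on CHARCLASS[0:127]."""
    table = [DELIMITER] * 128
    for code in range(128):
        ch = chr(code)
        if ch.isalpha() or ch == "_":
            table[code] = LETTER
        elif ch.isdigit():
            table[code] = NUMBER
        elif ch in " \t\r\n\f":
            table[code] = SEPARATOR
    return table


def build_break_set(table, wanted):
    """Stand-in for BREAKTABLES: derive a lookup set from the class table."""
    return frozenset(chr(code) for code in range(128) if table[code] == wanted)


charclass = default_charclass()
separators = build_break_set(charclass, SEPARATOR)

# Analogue of: REQUIRE "#" SEPARATOR;
charclass[ord("#")] = SEPARATOR            # one class per character: old one is lost
stale = "#" in separators                  # the derived set has not been rebuilt yet
separators = build_break_set(charclass, SEPARATOR)   # the BREAKTABLES step
fresh = "#" in separators
```

Here stale is False and fresh is True: changing the table alone is not seen by the scanner until the derived set is rebuilt.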
2.4 Possible tokens.
------------------------
The integer variable TOKEN will receive, after each call to
the scanner, a code corresponding to the type of the token scanned.
Possible types belong to four classes: single-character
delimiters; constants of various types; identifiers; and
two-character delimiters.
2.4.1 Single-character delimiters.
------------------------------------
For one-character delimiters, the value of TOKEN will be the
ASCII code, i.e., a number between 0 and 127.
**************************************************************
EXAMPLE
The standard values are assumed for CHARCLASS.
In this example, as in all examples below, the SAIL string
variable SYMB is given by the characters it contains (without
the enclosing quotes).
(input) (TOKEN) (SYMB)
; (octal)73 ;
← (octal)137 ←
& (octal)46 &
**************************************************************
The "pseudo-token" TDELIMITER is a global integer variable,
set to TOKEN if a delimiter is recognized, and to an absurd value
otherwise. Thus a test for TOKEN = TDELIMITER just after a call to
the scanner will have the desired effect.
2.4.2 Constants
-----------------
The following possible tokens are provided (the names below are
macros whose values are negative integers):
TINTEGER: Integer constant.
TREAL: Real constant.
TSTRING: String constant.
TENDFILE: <End of file>.
An "integer constant" is any string of characters with
CHARCLASS equal to NUMBER (normally the digits 0 through 9), or a
character of CHARCLASS equal to OCTAL followed by zero or more
SEPARATOR characters followed by a string of NUMBER characters. In
the latter case, SYMB will be the string representation of the
decimal integer equal to the octal integer scanned (this may seem
awkward but allows one to have only one type of integer constants
returned to the calling program). No check for integer overflow, or
for the legitimacy of octal digits, is made.
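The octal rule can be sketched as follows (Python, for illustration only). The sketch assumes that ' is the OCTAL character, as in the examples of this document, and that blank and tab are SEPARATOR characters; the function name integer_symb is hypothetical. Like the scanner described above, it does not check that the digits are legitimate octal digits; what the real scanner returns in that case is not stated here.

```python
def integer_symb(text: str) -> str:
    """Return the SYMB string for an integer constant.

    Assumes "'" is the OCTAL character and blank/tab are SEPARATORs.
    No check is made that the digits are octal, nor for overflow.
    """
    if text.startswith("'"):
        digits = text[1:].lstrip(" \t")      # zero or more separators may follow
        value = 0
        for d in digits:
            value = value * 8 + (ord(d) - ord("0"))
        return str(value)                    # decimal representation of the octal value
    return text                              # plain decimal constant: unchanged
```

For instance, integer_symb("'27") gives "23" and integer_symb("273") gives "273".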
A "real constant" is a string made of any positive number of
NUMBER characters and exactly one DECIMALDOT character.
A "string constant" is any sequence of characters enclosed in
QUOTE characters. The usual convention about string constants is
made, i.e., any QUOTE character which is part of a string must be
repeated.
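The doubled-quote convention can be sketched as below (Python, illustration only; scan_string is a hypothetical name and " is assumed to be the only QUOTE character).

```python
def scan_string(text: str, start: int = 0, quote: str = '"'):
    """Scan a string constant beginning at text[start], which must be a quote.

    Returns (symb, next_index). A doubled quote inside the string stands
    for one quote character.
    """
    if text[start] != quote:
        raise ValueError("a string constant must begin with a QUOTE character")
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == quote:
            if i + 1 < len(text) and text[i + 1] == quote:
                out.append(quote)            # doubled quote: one literal quote
                i += 2
                continue
            return "".join(out), i + 1       # closing quote
        out.append(ch)
        i += 1
    raise ValueError("unterminated string constant")
```

No upper-case conversion is applied to the contents, in line with the examples of this document.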
When an <End of file> is scanned, it may be one of three
things: the end of a macro text that is being expanded (see section 3
on macros); the end of a file required by the original scanned file;
or the end of this original file. In the second and third cases, the
scanner will close the channel associated with the given file; in the
first and second cases, it will pop the stack POPCHANSTACK which is
used to keep track of both the REQUIREd source_files and the macros
called. This should be more clear after paragraphs 2.5 (source file
switching) and 3 (macros) have been read.
*************************************************************
EXAMPLE
The standard values are assumed for CHARCLASS.
(input) (TOKEN) (SYMB)
2 tinteger 2
-9999 (octal)55 -
(NEXT SCANNER CALL)
tinteger 9999
'27 tinteger 23
"A string" tstring A string
"With a "" quote" tstring With a " quote
*************************************************************
2.4.3 Identifiers.
--------------------
An identifier begins with a LETTER character, and includes
all LETTER and NUMBER characters immediately following it. Letters
will be converted to upper-case.
Possible tokens for identifiers other than reserved words are
negative macros. When an identifier is encountered for the first
time, TOKEN receives the value TNONDECLARED.
If the symbol table feature is present, then the scanner will
systematically search identifiers in the symbol table. Normally, it
will also insert those which were not there before; setting the
global boolean DONTINSERT to TRUE will disable this last feature:
i.e., id's will be searched but not inserted. TOKEN will be
TNONDECLARED for new identifiers; for declared identifiers, it will
have the value of the RTYPE field of the corresponding symbol table
entry (which the calling program has of course the responsibility to
fill). Provided are the following possible values:
TINTVAR: Integer variable (not to be confused with TINTEGER).
TREALVAR: Real variable.
TSTRINGVAR: String variable.
TCOMPLEX.
TLABEL.
TRESERVED: Reserved word. If the user wants a single symbol table
for identifiers and reserved words, it is suggested,
rather than to use TRESERVED, to give a token number
to each reserved word by use of the function NEWRES, as
explained in section 2.6.
TSANSTYPE: For an identifier already encountered, but whose RTYPE
field has not been initialized. (Macro TSANSTYPE
has value 0, so ENDFILE should not have DELIMITER as
its CHARCLASS value).
TFRAME, TTRANS, TPLANE, TVECTOR (these are used by HAL).
TTYPE: Intended for an identifier which will be the name of
a type other than the standard types above.
Ways of adding new types are described below in
section 2.6.
TCLASS: Intended for classes of types. Useful for a compiler
written in PL.
When an identifier of any type is found, the global variable
TIDENTIFIER will be set to TOKEN by the scanner prior to exit, so
that a test for TOKEN = TIDENTIFIER will have the desired effect.
Also, when an identifier is found (new or not), the scanner will
point the global RECORD_POINTER variable NEW_ID to the corresponding
entry in the symbol table.
*************************************************************
EXAMPLE
The standard values are assumed for CHARCLASS; the symbol-table
module is assumed to be present.
(input) (TOKEN) (SYMB) (comment)
Strike1 tnondeclared STRIKE1 conversion to upper-case
chow_mein tnondeclared CHOW_MEIN
STRIKE1 tsanstype STRIKE1 (unless the calling program
has filled the RTYPE field
of the entry for STRIKE1 in
the meantime. If there was
no symbol table, TOKEN would be
TNONDECLARED again.).
In all three cases, TOKEN = TIDENTIFIER is TRUE after the scanner has been
called, and NEW_ID (a RECORD_POINTER of type ENTRI) will point to
the symbol table entry of the scanned id.
*************************************************************
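The behaviour shown in the example above (TNONDECLARED on first sight, TSANSTYPE afterwards) can be sketched as follows. This is Python, not the SAIL source; a plain dictionary stands in for the bucket-hash table, the value of TNONDECLARED is a placeholder, and the helper names are hypothetical. Letters, digits and "_" are assumed to have their standard classes.

```python
TNONDECLARED = -1   # placeholder; actual macro values are not given in the notes
TSANSTYPE = 0       # stated in the notes

symbol_table = {}   # name -> RTYPE; a plain dict stands in for the hash table


def is_letter(ch: str) -> bool:
    return ch.isascii() and (ch.isalpha() or ch == "_")


def is_number(ch: str) -> bool:
    return ch.isascii() and ch.isdigit()


def scan_identifier(text: str, dont_insert: bool = False):
    """Scan one identifier at the start of text; return (token, symb)."""
    if not text or not is_letter(text[0]):
        raise ValueError("an identifier must begin with a LETTER character")
    n = 1
    while n < len(text) and (is_letter(text[n]) or is_number(text[n])):
        n += 1
    symb = text[:n].upper()                  # letters are converted to upper-case
    if symb in symbol_table:
        return symbol_table[symb], symb      # RTYPE of the existing entry
    if not dont_insert:
        symbol_table[symb] = TSANSTYPE       # entry created, type not yet filled in
    return TNONDECLARED, symb
```

With dont_insert set, the same identifier keeps returning TNONDECLARED, since it is searched but never entered.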
2.4.4 Two-character delimiters.
---------------------------------
These can be recognized only if the "symbol-table"
feature is used. No two-character delimiter is provided initially.
Saying:
REQUIRE "<char1><char2>" DOUBLE_DELIMITER;
will create one. <char1> must be a DELIMITER character; <char2> can
be any character. The symbol table must be used. Two-character delims
are stored like reserved words and receive a token number greater
than 128. It is this value that TOKEN, along with TDELIMITER and the
other "pseudo-token" TDOUBLEDELIM, will get in subsequent scans of the
double delimiter.
**************************************************************
EXAMPLE
The standard values are assumed for CHARCLASS.
(input) (TOKEN) (SYMB) (comment)
REQUIRE ":=" DOUBLE_DELIMITER;
:= (a value≥128) := TOKEN = TDELIMITER is true.
REQUIRE "J←" DOUBLE_DELIMITER; illegal: J is not a delimiter.
REQUIRE "J" DELIMITER;
REQUIRE "J←" DOUBLE_DELIMITER; Now legal.
J← (a value≥128) J←
J (octal)112 J J is now a delimiter
(TOKEN = TDELIMITER now true).
j tnondeclared J But j is still a letter!
(Though it will be converted
to upper-case).
**************************************************************
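A possible way to picture two-character delimiters is the sketch below (Python, illustration only). The set of DELIMITER characters, the starting token number 128 and the function names are assumptions; the real package stores double delimiters in the symbol table like reserved words.

```python
delimiters = set(";:&-+=")         # assumed DELIMITER characters for this sketch
double_delims = {}                 # pair -> token number
next_token = 128                   # assumed first number for double delimiters


def require_double_delimiter(pair: str) -> int:
    """Analogue of: REQUIRE "<char1><char2>" DOUBLE_DELIMITER;"""
    global next_token
    if len(pair) != 2 or pair[0] not in delimiters:
        raise ValueError("first character must be a DELIMITER character")
    if pair not in double_delims:
        double_delims[pair] = next_token
        next_token += 1
    return double_delims[pair]


def scan_delimiter(text: str):
    """Return (token, symb) for the delimiter at the start of text."""
    pair = text[:2]
    if pair in double_delims:              # try the two-character match first
        return double_delims[pair], pair
    if text[:1] not in delimiters:
        raise ValueError("not a DELIMITER character")
    return ord(text[0]), text[0]           # single delimiter: its character code
```

As in the example above, declaring "J←" fails until J itself has been made a DELIMITER character (here, by adding it to the set).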
2.5 File switching.
-----------------------
The construct
REQUIRE "<File>" SOURCE_FILE;
causes the scanner to switch to <File> until an <end of file> is hit.
It can appear anywhere a token would be expected and will be trapped
by METALEXAN.
*************************************************************
IMPORTANT NOTE: <end of file> is not a separator.
If file MINE contains:
9 HAMBURGERS REQUIRE "YOURS" SOURCE_FILE;IES.......
and file YOURS consists of
AND FRENCH_FR<End of file>
then successive scans will yield:
(TOKEN) (SYMB)
tinteger 9
<id of some sort> HAMBURGERS
<id of some sort> AND
<id of some sort> FRENCH_FRIES
............
*************************************************************
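The note above follows directly from a reader that keeps a stack of texts and silently pops at each end of text. The sketch below (Python, illustration only; SourceStack and read_all are hypothetical names) shows that no character is produced at the boundary, so FRENCH_FR and IES join into one identifier.

```python
class SourceStack:
    """Character reader over a stack of texts (files or macro bodies)."""

    def __init__(self, text: str):
        self.stack = [[text, 0]]          # each frame: [text, position]

    def push(self, text: str):
        """Switch to a new text, as REQUIRE ... SOURCE_FILE would."""
        self.stack.append([text, 0])

    def next_char(self):
        """Return the next character, or None when every text is exhausted."""
        while self.stack:
            frame = self.stack[-1]
            if frame[1] < len(frame[0]):
                ch = frame[0][frame[1]]
                frame[1] += 1
                return ch
            self.stack.pop()              # end of text: pop, emit nothing
        return None


def read_all(src: SourceStack) -> str:
    chars = []
    while True:
        ch = src.next_char()
        if ch is None:
            return "".join(chars)
        chars.append(ch)


# What remains of MINE after the REQUIRE, then the contents of YOURS on top.
src = SourceStack("IES")
src.push("AND FRENCH_FR")
joined = read_all(src)                    # "AND FRENCH_FRIES"
```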
2.6 Adding new types and reserved words.
-----------------------------------------
To add a reserved word that will be stored in the same symbol
table as other identifiers and will return as TOKEN a number > 200
(tokens between -100 and 200 are reserved by the scanner), one can
use the function NEWRES. Saying
<Integer variable> ← NEWRES("BEGIN")
will do just that; <Integer variable> will receive the number
attributed to the new reserved word.
There are basically three ways to add non-standard types
(i.e., types not listed in 2.4.3). Suppose one wants to define a new
type TQUATERNION for identifiers. The way one will do it will depend
on how one's compiler is written:
- If it is strictly a source SAIL program (i.e., if one wants
to be able to say "IF ENTRI:RTYPE[NEW_ID] = TQUATERNION THEN..."
later in the program), then one should make use of the macro
DEFNEWTYPE: Saying DEFNEWTYPE(TQUATERNION) will define a new macro
TQUATERNION whose value differs from any one previously attributed to
a type.
- However, if the compiler uses some sort of bootstrapping
(e.g., is written in PL), then it might be useful to consider the new
type name as a meaningful identifier. In such a case, one should use
the function NEWTYPE. Saying:
<Integer variable> ← NEWTYPE("TQUATERNION")
will load the identifier TQUATERNION into the symbol table; the RTYPE
field of the new entry will receive the value TTYPE, and its VAL
field will receive a value different from any one previously
attributed to a type.
If its third argument (a boolean) is TRUE, INITSCAN will
carry on the above process for all standard type names (which are
thus reserved words for the scanner). If NEWTYPE is called for a
standard type (e.g. NEWTYPE("TINTVAR")), the latter will be
recognized as already declared and will not be given a new number.
- Another possibility is to let the user's program define new
types: When recognized by METALEXAN, the construct
REQUIRE "TQUATERNION" NEW_TYPE;
will cause a call to NEWTYPE("TQUATERNION").
Types are attributed negative values; standard types have
values -20 to -34. The STRING ARRAY TOKENAME[-60:0] contains the
names of the types (and other tokens); for the first 28 new types,
NEWTYPE will update TOKENAME.
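The numbering conventions of this section can be summarized by the sketch below (Python, illustration only). Several details are assumptions and not taken from the package: the value used for TTYPE, the first free type value (-35), the first reserved-word number (201), and the behaviour of new_res when a word is declared twice.

```python
TTYPE = -33          # placeholder; the notes only say standard types span -20..-34
symbol_table = {}    # name -> {"rtype": ..., "val": ...}
next_type = -35      # assumed: first free value below the standard types
next_reserved = 201  # reserved words receive tokens greater than 200


def new_type(name: str) -> int:
    """Analogue of NEWTYPE: enter a type name, return its (negative) number."""
    global next_type
    entry = symbol_table.get(name)
    if entry is not None and entry["rtype"] == TTYPE:
        return entry["val"]               # already declared: no new number
    symbol_table[name] = {"rtype": TTYPE, "val": next_type}
    next_type -= 1
    return symbol_table[name]["val"]


def new_res(word: str) -> int:
    """Analogue of NEWRES: enter a reserved word, return its token number."""
    global next_reserved
    if word not in symbol_table:
        symbol_table[word] = {"rtype": next_reserved, "val": 0}
        next_reserved += 1
    return symbol_table[word]["rtype"]
```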
2.7 Example.
----------------
Assume CHARCLASS has the values shown in appendix 1.
Suppose the input file consists of the following characters:
273 all-beef "Alioto""s" TACOS ;
2.3
gallons of PEPSI {No ice, please!} {But cool}
{with whipped cream}
REQUIRE "MORE" source_file;
1 BEEF steak with_ketchup<end_of_file>
If file MORE contains:
'75 cocas <end of file>
then successive scannings will return:
(TOKEN) (SYMB) (comment)
tinteger 273
tnondeclared ALL Conversion to upper_case.
TIDENTIFIER=TOKEN is now true.
<octal 55> - TDELIMITER=TOKEN is now true.
tnondeclared BEEF
tstring Alioto"s No upper-case conversion.
tnondeclared TACOS
<octal 73> ; TDELIMITER=TOKEN is now true.
treal 2.3
tnondeclared GALLONS <Carriage return> and <lf> are SEPARATORS.
tnondeclared OF
tnondeclared PEPSI
tinteger 61 Switch to file MORE; conversion to decimal.
tnondeclared COCAS
TAB ignored; the <end of file> causes
scanning to resume on the first file.
tinteger 1 The comments do not return a token.
tsanstype BEEF Identifier BEEF appeared before, but was given no type.
tnondeclared STEAK
tnondeclared WITH_KETCHUP CHARCLASS of "_" is initialized to LETTER.
tendfile NULL Here the program will automatically close channel PROGCHAN.
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 3)Macro-processing. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
3.1 Basic
-------------
3.1.1 Macro-definitions
-----------------------
A macro with no arguments is defined by
DEFINE <Macro name> = ⊂ <Macro body> ⊃;
(< and > are meta-symbols in the above line).
<Macro name> is any identifier which has not been encountered
before, except that it may evaluate to the name of a macro already
defined, in which case DEFINE means "redefine".
⊂ and ⊃ are the initial macrobody delimiters; they can be
changed or supplemented using a REQUIRE construct (section 2.3).
<Macro body> is any sequence of characters. "Macrobodystart"
and "macrobodyend" characters can be nested; METALEXAN (which handles
macro definitions) will keep gobbling text until the "macrobodyend"
characters match the "macrobodystart" 's.
A macro with arguments is defined by
DEFINE <Macro name> (arg1,arg2,...,argn) = ⊂macrobody⊃
arg1, arg2, .. ,argn must be legal non-declared identifiers.
Macro arguments are treated as macros of their own. An attempt to use,
as an argument in a macro definition, a macro already known to the
program (e.g. an argument in a previous macro definition) will give
an error message, but will be understood correctly. The user will be
asked if she wants to be notified of this type of "error" in the
future. (Note: This is a bug, not a feature. I did not have time to
fix it. Try not to use the same dummy argument names for macros which
can call one another).
Macro bodies are stored "as is", that is, without any
evaluation. Therefore, if they contain macrocalls, these will be
evaluated for each use of the macro. This allows some trickeries (see
paragraph 3.2), and was chosen as a standard option because space in
HAL is more of a problem than time.
The rules for the appearance of macro arguments inside the
macro body are identical to those governing the appearance of macro
calls anywhere in the text (see below).
*********************************************************************
EXAMPLES:
DEFINE PRESIDENT = ⊂NIXON⊃;
DEFINE PISMO_NATACHI = ⊂IA K VAM PISHU : CHEVO JE BOLIE ?⊃;
DEFINE LES_CHATS = ⊂"LES AMOUREUX FERVENTS ET LES SAVANTS AUSTERES "⊃;
DEFINE ADDITION(X,Y,Z) = ⊂ X + Y +Z ⊃;
DEFINE DILEMNA(VERB) = ⊂TO VERB OR NOT TO VERB⊃;
DEFINE ERLKONIG(XXX,YYY) = ⊂DER XXX HALT SEINEN YYY⊃;
DEFINE MAC (ZZZ) = ⊂ ZZZ; ZZZ; DEFINE RRR = ⊂UUU⊃;RRR⊃
One can define several macros in one "statement":
DEFINE FOO = ⊂BAZ⊃, PRESIDENT = ⊂NIXON⊃;
*********************************************************************
3.1.2 Simple macro calls
------------------------
Macro calls can occur at any place where a token would be
expected. In that case, the macro call is simply
<Macro name> or <Macro name>(arg1, arg2, .....,argn) if the
macro was defined with arguments.
We call these "simple macro calls" because they do not
require the use of a special warning character, whereas the "tricky
macro calls" described below can occur anywhere in a program provided
they are enclosed in special characters.
*********************************************************************
EXAMPLES (→→→ means "expands to"; the definitions of 3.1.1 are
assumed):
PRESIDENT →→→ NIXON
PISMO_NATACHI →→→ IA K VAM PISHU : CHEVO JE BOLIE ?
LES_CHATS →→→ "LES AMOUREUX FERVENTS ET LES SAVANTS AUSTERES "
ADDITION(25,3,1) →→→ 25 + 3 + 1
DILEMNA(COMPUTE) →→→ TO COMPUTE OR NOT TO COMPUTE
ERLKONIG(VATER, SOHN) →→→ DER VATER HALT SEINEN SOHN
MAC (AAA) →→→ AAA; AAA; UUU (and RRR defined to be "UUU").
*********************************************************************
An argument to a macro-call (like arg1,...,argn in the
example above) may not normally contain any comma or right
parenthesis, since these are recognized by the scanner as delimiters
between arguments. It is possible, however, to use macro arguments
that contain such characters; they must be enclosed in the same sort
of delimiters as macro bodies in definitions (originally ⊂ and ⊃), and
will be evaluated according to the same rules (i.e., nesting is
permitted). For instance, assume we have:
DEFINE ADDITION(A,B,C) = ⊂ A + B +C ⊃
Then
ADDITION(F[X,Y,Z])
evaluates to
F[X + Y + Z]
whereas
ADDITION( ⊂ F[X,Y] ⊃, G[U], ⊂ H[I,J,K] ⊃)
will give
F[X,Y] + G[U] + H[I,J,K]
and
ADDITION(F[X,Y],G[U],H[I,J,K])
is an error (understood as "too many arguments"). There is no notion
of matching brackets as in SAIL.
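The three ADDITION cases can be reproduced with the sketch below (Python, illustration only). It assumes that the outermost ⊂ and ⊃ around an argument are dropped, that inner ones are kept, and that blanks around an argument are trimmed. The substitution step uses a regular expression over identifiers, which is a simplification: the package itself treats arguments as macros that are evaluated when the body is scanned. The names split_args and expand are hypothetical.

```python
import re


def split_args(text: str):
    """Split the text following '(' into arguments; return (args, rest).

    ',' and ')' end an argument unless they lie inside ⊂ ... ⊃ (which may
    nest). Square brackets get no special treatment.
    """
    args, cur, depth = [], [], 0
    for i, ch in enumerate(text):
        if ch == "⊂":
            if depth > 0:
                cur.append(ch)            # keep inner delimiters
            depth += 1
        elif ch == "⊃" and depth > 0:
            depth -= 1
            if depth > 0:
                cur.append(ch)
        elif depth == 0 and ch in ",)":
            args.append("".join(cur).strip())
            cur = []
            if ch == ")":
                return args, text[i + 1:]
        else:
            cur.append(ch)
    raise ValueError("missing right parenthesis")


def expand(body: str, params, args) -> str:
    """Replace each parameter name appearing in body by its argument text."""
    if len(args) != len(params):
        raise ValueError("wrong number of arguments")
    mapping = dict(zip(params, args))
    return re.sub(r"[A-Za-z_][A-Za-z_0-9]*",
                  lambda m: mapping.get(m.group(0), m.group(0)), body)
```

With these definitions, the text "F[X,Y],G[U],H[I,J,K])" splits into six arguments, which is why the last call above is rejected.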
The default option in the scanner is "EVAL", so that a macro
will always normally be evaluated. However, the "EVALSWITCH" flag is
set to OFF when a "donteval"-CHARCLASS character is encountered
(initially the only one such character is ` ). Evaluation resumes
when another of these control characters is encountered.
WARNING: If a macro name is re-DEFINEd without being enclosed
in "donteval" characters, then it will be evaluated using the old
<Macro body>. Thus
(assuming the definition above: DEFINE PRESIDENT = ⊂NIXON⊃):
DEFINE `PRESIDENT` = ⊂FORD⊃;
will redefine macro PRESIDENT as
FORD; but
DEFINE PRESIDENT = ⊂FORD⊃
will have the effect of defining a new macro NIXON with value FORD.
(Any subsequent call to PRESIDENT will evaluate to NIXON, which
itself evaluates to FORD. Thus PRESIDENT also evaluates to FORD in
this case because NIXON is a valid identifier. An error message would
be issued only if identifier NIXON had been given a type other than
TMACRO in the meantime).
3.2 Tricky
--------------
Macro calls can actually appear anywhere in a program. To be
recognized as such in a place where a token would not be expected to
begin, e.g. in the middle of an identifier or of a number, a macro
name must be enclosed between a "MACROCALL" character (initially $)
and a "MACROEND" character (initially %). Only the macro name must be
enclosed in these delimiters, not the argument list.
Any pair of (different) characters can be made "special macro
delimiter" by saying:
REQUIRE "<char1><char2>" MACRO_DELIMITERS;
A "MACROCALL" character means:
******* CAUTION: MACRO CALL AHEAD ***************
and a "MACROEND":
******* YOU HAVE BEEN WATCHING A MACRO IN ACTION **********.
They can be nested to any degree, that is, the evaluation of
a macro name can require the recursive evaluation of a macro call.
The above rules allow a good deal of trickery (and
non-readability of programs). Below are a few examples. Recall that
macro arguments are evaluated according to the same rules as macro
names.
***************************************************************************
EXAMPLES
DEFINE VEGETABLE = ⊂POTATO⊃, BAKED = ⊂BURNED⊃, MEAL = ⊂VEGETABLE⊃,
DINNER(WHAT, HOW) = ⊂ONE_$HOW%_$WHAT%⊃;
BAKED_VEGETABLE →→→ BAKED_VEGETABLE
BAKED_$VEGETABLE% →→→ BAKED_POTATO
BAKED_$MEAL% →→→ BAKED_POTATO
$BAKED%$MEAL% →→→ BURNEDPOTATO
$BAKED%_$MEAL% →→→ BURNED_POTATO
DINNER(MEAL,BAKED) →→→ ONE_BURNED_POTATO
ONLY_$DINNER%(MEAL,BAKED) →→→ ONLY_ONE_BURNED_POTATO
DEFINE MAC1 =⊂3⊃; DEFINE MAC2= ⊂1⊃;
DEFINE MAC3= ⊂2⊃; DEFINE MAC13 =⊂99⊃;
MAC$MAC3% →→→ 1 (MAC3 expands to 2, so the whole expression
expands to MAC2, which evaluates to 1).
9$MAC$MAC$MAC3%%$MAC1%%9 →→→ 9999 (development left to the reader).
***************************************************************************
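The argument-less examples above can be checked with the following sketch (Python, illustration only). It handles only macros without arguments whose bodies are single tokens, uses $ and % as the two special characters, and treats the whole resulting token as a simple macro call once the enclosed calls have been replaced. The names resolve, expand_name and evaluate are hypothetical, and no guard against a macro that expands to itself is included.

```python
MACROS = {
    "VEGETABLE": "POTATO", "BAKED": "BURNED", "MEAL": "VEGETABLE",
    "MAC1": "3", "MAC2": "1", "MAC3": "2", "MAC13": "99",
}


def expand_name(name: str) -> str:
    """Expand a macro name until the result is no longer a macro name."""
    while name in MACROS:
        name = MACROS[name]
    return name


def resolve(text: str, pos: int = 0, nested: bool = False):
    """Replace every $name% in text, innermost first.

    Returns (result, next_pos). When nested is True, scanning stops at the
    '%' that closes the enclosing call.
    """
    out = []
    while pos < len(text):
        ch = text[pos]
        if ch == "$":
            name, pos = resolve(text, pos + 1, nested=True)
            if name not in MACROS:
                raise KeyError("undefined macro: " + name)
            out.append(expand_name(name))
        elif ch == "%" and nested:
            return "".join(out), pos + 1
        else:
            out.append(ch)
            pos += 1
    if nested:
        raise ValueError("missing closing macro character")
    return "".join(out), pos


def evaluate(token: str) -> str:
    """Resolve the enclosed calls, then treat the result as a simple call."""
    return expand_name(resolve(token)[0])
```

For the last example, the innermost $MAC3% gives 2, so $MAC$MAC3%% is $MAC2%, i.e. 1; $MAC1% gives 3; the outer name is therefore MAC13, which gives 99 between the two 9's.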
In an even trickier example (adapted from GPM), suppose we
want to define a "successor" operation for digits 0 through 9. One
way to do this is:
DEFINE SUCCESSOR(X) = ⊂DEFINE `SUC`(Y0,Y1,Y2,Y3,Y4,Y5,Y6,Y7,Y8,Y9)
= ⊂Y$X%⊃;SUC(1,2,3,4,5,6,7,8,9,0)⊃;
Suppose we call SUCCESSOR:
SUCCESSOR(3)
SUCCESSOR(4)
The call SUCCESSOR(3) causes a new definition of macro SUC to
be accepted (notice the importance of having SUC enclosed in
"donteval" quotes). SUC has arguments Y0,Y1,.....,Y9 and body Y$X%
(recall that macro bodies are never evaluated at definition time).
The new SUC is immediately called with arguments 1,2,....,9,0; in
other terms, the "macro" Y0 is given the "value" 1, Y1 receives 2,
etc. When the body "Y$X%" of SUC is evaluated, X is recognized as a
"macro" with value 3, and Y$X% thus evaluates successively to Y3 and
4. Similarly, SUCCESSOR(4) will give 5.
Macro evaluation is turned off in two cases. The first one,
as we have already seen, is macro body evaluation, in macro
definitions. The other is macro argument evaluation in macro calls.
Thus the following will not work:
DEFINE ADDITION(A,B) = ⊂ A + B ⊃;
DEFINE COMMA = ⊂,⊃, RIGHTPAREN = ⊂)⊃;
ADDITION(12 COMMA 22 RIGHTPAREN
Note: Parenthesis and commas in macro (definition or call)
argument lists are the only example of characters being referenced by
their names rather than by their CHARCLASS values.
Of course, parameters to a macro call may themselves require
evaluation of macros. In that case, however, these will be evaluated
anew whenever referenced in the macro body.
3.3 Discussion, and comparison with GPM.
--------------------------------------------
This system is intended to allow the user to do
simple macro-processing easily and with the slightest possible
overhead, while giving her the possibility to do much more
sophisticated things if she really means it.
The amount of back-up involved in macro evaluation
has been limited by use of the following method: a macro-call is
treated essentially the same way a
REQUIRE "<Filename>" SOURCE_FILE
is, i.e., a macro text (rather than a source file name) is pushed
onto a stack, and the reading procedure receives the order to
accept characters from a text string rather than from a
file. Recursive macro calls are easily implemented in this scheme.
"End of macro" and "End of file" receive exactly the same treatment:
pop the stack.
The power of this system is about equivalent to that of GPM;
however, the design differs from GPM on a number of points:
- GPM is viewed as a preprocessor which needs a complete scan
of the source text by itself, not as a component of a one-pass
scanner.
- The GPM convention that a macro definition is just a
special case of a macro call seems at first elegant. However, it does
not take into account the fact that the two are actually processed
quite differently (one brings a change to the data base of the
program, the other generates mere text replacement). Since it is nice
to have macro definition trapped at the lexical level, rather than
having a parser routine process it, the solution seems to be a
"syntactic" interface between the scanner and the parser, which
processes definitions and returns only actual parsing tokens.
Besides doing this, this interface can trap and process a few other
special words, like "REQUIRE".
- Another GPM principle states that all arguments of a macro
call must be evaluated before expansion begins. This makes
implementation more difficult; on the other hand, the extra
facilities seem needless. The GPM version of SUCCESSOR above would
be, in our notation:
DEFINE SUCCESSOR(⊗X) = ⊂SUC(1,2,3,4,5,6,7,8,9,0,
DEFINE SUC(Y0,Y1,Y2,Y3,Y4,Y5,Y6,Y7,Y8,Y9)
= ⊂Y$X%⊃ ⊃;
I.e., the DEFINE SUC would itself be a parameter to the call
of SUCCESSOR.
Since the macro table is not looked up until all the macro parameters
have been evaluated (and at that time the macro name SUC is
associated with the right thing), the desired effect is obtained. It
seems to me, however, that this is needlessly tricky and that it is
just as neat to require that everything, including macros, be declared
before use.
- The conventions about special characters <start evaluating>,
<stop evaluating>, <start quoting>, <stop quoting> and their nesting
properties make a GPM program very difficult to read.
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 4) DESCRIPTION OF THE PROCEDURES LEXAN and METALEXAN. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
4.1 SIMPLE INTERNAL PROCEDURE LEXAN;
------------------------------------------
This is a standard scanner. It skips over "SEPARATOR"
characters, reads a token, and returns its type (INTEGER
variable TOKEN), and string representation (SYMB). A "SEPARATOR"
is a character which will terminate a token (such as a number,
identifier etc.), but will not be part of any token.
If the token scanned is a defined macro name and EVALSWITCH
has value ON, or if the break character that stopped evaluation is a
"WATCH OUT: MACRO" character, the macro call is evaluated recursively
and its expansion takes its place in the token returned to the
calling program.
When an identifier is recognized, procedure SEARCHINSERT is
called; if the identifier was declared before, LEXAN will return the
code for its type in TOKEN; otherwise, unless the global boolean
DONTINSERT is TRUE (default is FALSE), it will insert the id. The
user can inhibit the symbol table procedures by erasing a line in
INIT.SAI (i.e., without having to recompile IOREL and SEAR).
The exact list of the global variables used and/or modified
by LEXAN appears in a comment at the beginning of the procedure.
4.2 SIMPLE INTERNAL PROCEDURE METALEXAN;
--------------------------------------------
If the user does not erase the line
DEFINE LEXAN = ⊂METALEXAN⊃;
in the file INIT.SAI, then METALEXAN will be called in lieu of LEXAN.
Its only role is to call LEXAN and to process any macro definitions
or REQUIRE constructs that might appear in the source text.
Thus only "true" tokens will be returned to the calling program.
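The filtering role of METALEXAN can be pictured with the Python sketch
below. It is a model under stated assumptions, not the actual
procedure: make_metalexan, the token list and the simplified form
"DEFINE name = body ;" (without macro body delimiters or parameters)
are all invented here for illustration, and REQUIRE is left out.

```python
def make_metalexan(lexan, macros):
    """Wrap `lexan` so that DEFINE constructs never reach the caller."""

    def metalexan():
        while True:
            token = lexan()
            if token != "DEFINE":
                return token            # a "true" token: hand it to the parser
            # Hypothetical simplified syntax: DEFINE name = body ;
            name = lexan()
            lexan()                     # skip the "=" sign
            body = []
            token = lexan()
            while token != ";":
                body.append(token)
                token = lexan()
            macros[name] = body         # record the definition, return nothing

    return metalexan


# A stand-in for LEXAN that hands out prepared tokens one at a time.
tokens = iter(["DEFINE", "N", "=", "10", ";", "X", "+", "N"])
macros = {}
metalexan = make_metalexan(lambda: next(tokens), macros)
seen = [metalexan(), metalexan(), metalexan()]
```

The parser sees only the tokens collected in seen; the definition has
been absorbed into the macros table on the way.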
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 5) OTHER PROCEDURES. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
5.1) SIMPLE INTERNAL BOOLEAN PROCEDURE PLEASE_ANSWER (VALUE STRING QUESTION);
--------------------------------------------------------------------------------
Types its argument and reads an answer "yes" or "no" at the
teletype. Very tolerant as to the syntax of the answer. For speed,
<Cr> = No, <Altmode> = Yes.
5.2) SIMPLE INTERNAL PROCEDURE OPEN_LOOKUP_ENTER(REFERENCE STRING file;
REFERENCE INTEGER channel;
VALUE INTEGER mode, inbufnum, outbufnum;
REFERENCE INTEGER count, brchar, eof);
----------------------------------------------------------------
The user should not normally have to bother with this
procedure (see the description of INITSCAN below).
Most parameters have the meaning defined in the SAIL manual
for the OPEN function.
If FILE is initially NULL, the user is asked for a filename.
If INBUFNUM > 0, the file is opened for input. (If it does not
exist, the program keeps asking for a new name.)
If OUTBUFNUM > 0, an ENTER is performed.
5.3) BOOLEAN PROCEDURE SEARCHINSERT (VALUE STRING id);
------------------------------------------------------
This procedure works on a symbol table based on bucket hash.
The current hash function, which seems to give a good statistical
distribution on big SAIL programs, has 64 buckets and is expressed by
8 * (first letter (modulo 8)) + (last letter (modulo 8))
(this is the form that reproduces the HASH column of Appendix 3).
The symbol table is organized with SAIL RECORD_CLASS'es and
RECORD_POINTER's (these are abbreviated as TYPE and REF. NULL_RECORD
is abbreviated as NIL). An entry in the symbol table is a record of
type ENTRI, with fields RTYPE (for the type), NAME (a string
itemvar), VAL (an integer which the scanner uses for macros as a
pointer to a macro body table), and LINK.
SEARCHINSERT returns TRUE if ID was already in the table.
Otherwise, it inserts it (at the "head" of its bucket), with RTYPE
initialized to 0, and returns FALSE. In all cases, the global
reference variable NEW_ID points to the record containing ID, which
is also BUCKET[HASH(ID)].
There is also a procedure ONLYSEARCH(.......) which is
identical to SEARCHINSERT, but does not insert new id's. This
procedure is local to IOREL.
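As an illustration only, the table organization described above can be
modelled in Python. This is a sketch, not the SAIL source: hash_id,
Entri, buckets and search_insert are stand-in names, and the hash is
read as 8 * (first letter mod 8) + (last letter mod 8), an assumption
that reproduces the HASH values listed in Appendix 3.

```python
NBUCKETS = 64


def hash_id(ident):
    # Assumed reading of the hash: 8 * (first mod 8) + (last mod 8).
    return 8 * (ord(ident[0]) % 8) + (ord(ident[-1]) % 8)


class Entri:
    """One symbol table record with the four fields named in the text."""

    def __init__(self, name, link):
        self.rtype = 0      # type code, 0 until somebody fills it in
        self.name = name
        self.val = 0        # used by the scanner for macros
        self.link = link    # next record in the same bucket


buckets = [None] * NBUCKETS
new_id = None               # plays the role of the global NEW_ID


def search_insert(ident):
    """Return True if ident was already present, else insert it and return False."""
    global new_id
    h = hash_id(ident)
    entry = buckets[h]
    while entry is not None:
        if entry.name == ident:
            new_id = entry
            return True
        entry = entry.link
    new_id = Entri(ident, buckets[h])   # insert at the head of the bucket
    buckets[h] = new_id
    return False


first_time = search_insert("ALPHA")
second_time = search_insert("ALPHA")
```

With this reading, hash_id("IGNORED") is 12 and hash_id("OCTAL") is 60,
the values shown for these words in Appendix 3.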
5.4) SIMPLE PROCEDURE BREAKTABLES;
----------------------------------
Initializes (or reinitializes) the breaktables used by the
scanner, according to the current values of the array CHARCLASS.
5.5) SIMPLE PROCEDURE INITSCAN(REFERENCE STRING FILENAME; VALUE INTEGER INBUFNUM; VALUE BOOLEAN INSTYPES);
----------------------------------------------------------------------------------
Opens file FILENAME for scanner input on channel PROGCHAN, by
calling OPEN_LOOKUP_ENTER with parameters PCOUNT, PEOF, PBRCHAR
(these three are global integer variables), mode 0, no output
buffers, and INBUFNUM input buffers.
OPEN_LOOKUP_ENTER will ask for more information at execution
time if FILENAME is NULL or if the file called does not exist.
If the symbol table procedures are present, INITSCAN inserts
the scanner's reserved words into the symbol table; it will also
insert names of types if INSTYPES is TRUE.
5.6) SIMPLE PROCEDURE ERROR(VALUE STRING MESS);
-----------------------------------------------
Prints its string argument, along with the source file
sections recently read and, if appropriate, the macro text being
expanded and a pointer to it.
There is currently a small bug in ERROR: since the reading
procedure keeps a circular list of the last 16 elements read, and it
is this list which is output by ERROR as the "context" of the error,
things which are deleted by a break table (essentially, the ending
quotes in strings and "endcomment" characters) do not appear in this
"context". Maybe I'll fix that.
5.7 PROCEDURE PRINTABLE(VALUE INTEGER TYPVAR);
----------------------------------------------
Prints the symbol table. The best way to find out about the
format is to look at Appendix 3.
If TYPVAR is 1, all identifiers are printed; otherwise, only
identifiers of type TYPVAR will appear.
5.8 SIMPLE PROCEDURE NEWRES(VALUE STRING RESWORD; VALUE BOOLEAN SEARCHED);
--------------------------------------------------------------------------
Gives to identifier RESWORD a positive type number never
before attributed to a reserved word. Fills the corresponding RTYPE
field in the symbol table entry.
If SEARCHED is TRUE, NEWRES will assume that RESWORD has just
been searched and inserted by the scanner in the symbol table, and
that NEW_ID thus points at the right entry. Otherwise it will call
SEARCHINSERT and output an error message if RESWORD was previously
declared.
5.9 SIMPLE PROCEDURE NEWTYPE (VALUE STRING TYPENAME; VALUE BOOLEAN SEARCHED);
-----------------------------------------------------------------------------
Gives to identifier TYPENAME a "RTYPE" equal to TTYPE and a
negative "VAL" different from any one previously associated with a
type. The second argument has the same meaning as in NEWRES.
5.10 SIMPLE PROCEDUREs NEWBLOCK and LEAVEBLOCK;
-----------------------------------------------
These two procedures with no arguments can be helpful for a
compiler of a block-structured language. The first one saves the
current 64 buckets of the symbol table in, and the second one
retrieves them from, a stack of CONTEXT variables.
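The effect of these two procedures can be sketched in Python as
follows. This is a model under assumptions, not the SAIL code: the
names new_block, leave_block, insert and context_stack are invented
here, and each bucket is represented as a chain of (name, link) pairs.
Because new identifiers go at the head of a bucket, saving the 64
bucket heads is enough to forget everything declared inside a block.

```python
NBUCKETS = 64
buckets = [None] * NBUCKETS     # each bucket is a chain of (name, link) pairs
context_stack = []              # stands in for the stack of CONTEXT variables


def insert(bucket_no, name):
    # New identifiers are placed at the head of their bucket.
    buckets[bucket_no] = (name, buckets[bucket_no])


def new_block():
    # Save the current heads of all the buckets.
    context_stack.append(list(buckets))


def leave_block():
    # Restore the heads saved at block entry; inner declarations vanish.
    buckets[:] = context_stack.pop()


insert(12, "OUTER")
new_block()
insert(12, "INNER")
inside = buckets[12][0]     # identifier visible inside the block
leave_block()
outside = buckets[12][0]    # identifier visible after leaving the block
```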
5.11 SIMPLE PROCEDUREs SKIP_DIRECTORY_PAGE (not necessary if the symbol
table feature is used) and CHECK_EXTENSION(REFERENCE STRING FILENAME,
VALUE STRING EXTENSION);
-------------------------------------------------------------------------
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
≡ 6) USER'S GUIDE. ≡
≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡
6.1 Requiring the routines.
-------------------------------
To use the package: say
COPY INIT ← INIT[CSP,SYS]<CR> (size of the file: 1.1K).
Then REQUIRE "INIT" as a SOURCE_FILE. INIT will REQUIRE the
other necessary SOURCE_FILE's and LOAD_MODULE's. (Size of IOREL.REL +
SEAR.REL: 4.7 K).
INIT will expect a SIMPLE INTERNAL PROCEDURE MODIFSCAN to be
declared (possibly with empty body). Its normal purpose is to change
the standard values of CHARCLASS, e.g. CHARCLASS["#"] ← DELIMITER.
To initialize the scanning process write in your program:
INITSCAN(<filename>, <inbufnum>, <instypes>).
<filename> must be a string variable or a string constant, which
contains the name of a file to be scanned. If there is an error in a
file specification at any point in the program (e.g., in a REQUIRE
SOURCE_FILE construct), the system will keep on asking.
<inbufnum> is the number of input buffers to be used by the
reading function.
<instypes> is TRUE if the names of types are also to be
inserted into the symbol table (see 5.5).
The scanning process is then initiated; saying LEXAN will
cause one token to be scanned.
EXAMPLE
************************************************************************
REQUIRE "INIT" SOURCE_FILE;
INITSCAN("MICKEY[COMICS,WD]", 17, FALSE);
LEXAN; {This will scan the first token in MICKEY}
*************************************************************************
6.2 Options.
----------------
The standard system contains all the facilities described in
this document. The user, however, has the ability to get rid of one
or more of the following characteristics:
- Systematic call of METALEXAN at the beginning of each LEXAN
call. Without it, the words DEFINE and REQUIRE will not be recognized.
- Use of the symbol table. Recall that without it, macros and
the scanner's reserved words (COMMENT, DEFINE, REQUIRE, and the words
that can appear after a REQUIRE) will not be recognized.
- If you want the program to recognize these special words,
along with macros, but not to insert new identifiers into the symbol
tables, then you should say DONTINSERT ← TRUE in your program; the
scanner will then use the symbol table for its own stuff, but not for
yours. DONTINSERT can of course be set and reset at different points.
***********************************************************************
* Page 6 of INIT indicates which lines should be erased in *
* order to effect the first two changes above. *
***********************************************************************
Note: There is no way to request the "scanning" system only,
without making use of the macro facility. However, if the user's
input program does not request any macro definitions and calls, the
overhead due to the existence of the macro facility in the package
will be negligible.
You might want to use the symbol table module, but add some
fields to the format of the entries. In that case, you should change
the declaration of RECORD_CLASS ENTRI in INIT.SAI (INTERNAL) and in
SEAR.SAI and IOREL.SAI (EXTERNAL). IOREL and SEAR (or rather copies
in your area) will then have to be recompiled.
6.3 "Debug" mode.
---------------------
Saying DEBUGMODE ← TRUE at any point will cause the program
to print TOKEN and SYMB for each token scanned, along with
information about macro definitions, macro calls, symbol table
searches, processing of certain REQUIRE constructs, etc. One can
also include REQUIRE "ON" DEBUG_MODE; at any point in the program;
REQUIRing any DEBUG_MODE other than ON will terminate debug mode.
6.4 Test Program.
---------------------
RU TEST[CSP,SYS] will run the scanner in "Debug" mode on a
file containing all the examples of this document, in the order they
appear in it.
APPENDIX 1: Initial values for CHARCLASS
CHARCLASS is PRELOADed by INIT.SAI in the following way:
     0          1          2          3          4          5          6          7
   |≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡|
000| NUL        ↓          α          β          ∧          ¬          ε          π         |
   | endfile    illegal    illeg.     illeg.     delimiter  delim.     illeg.     illeg.    |
   |----------------------------------------------------------------------------------------|
010| λ          TAB        LF         VT         FF         CR         ∞          ∂         |
   | illeg.     separator  separ.     illeg.     separ.     separ.     illeg.     illeg.    |
   |----------------------------------------------------------------------------------------|
020| ⊂          ⊃          ∩          ∪          ∀          ∃          ⊗          ↔         |
   | macbodst.  macbodend  illeg.     illeg.     illeg.     illeg.     illeg.     illeg.    |
   |----------------------------------------------------------------------------------------|
030| _          →          ~          ≠          ≤          ≥          ≡          ∨         |
   | letter     illeg.     illeg.     delim.     delim.     delim.     illeg.     delim.    |
   |----------------------------------------------------------------------------------------|
040| SP         !          "          #          $          %          &          '         |
   | separ.     letter     quote      illeg.     macrocall  endmacro   delim.     octal     |
   |----------------------------------------------------------------------------------------|
050| (          )          *          +          ,          -          .          /         |
   | delim.     delim.     delim.     delim.     delim.     delim.     decimaldot delim.    |
   |----------------------------------------------------------------------------------------|
060| 0          1          2          3          4          5          6          7         |
   | number (etc............................................................................|
   |----------------------------------------------------------------------------------------|
070| 8          9          :          ;          <          =          >          ?         |
   |.....................) delim.     delim.     delim.     delim.     delim.     illeg.    |
   |----------------------------------------------------------------------------------------|
100| @          A          B          C          D          E          F          G         |
   | illeg.     letter (etc.................................................................|
   |----------------------------------------------------------------------------------------|
110| H          I          J          K          L          M          N          O         |
   |........................................................................................|
   |----------------------------------------------------------------------------------------|
120| P          Q          R          S          T          U          V          W         |
   |........................................................................................|
   |----------------------------------------------------------------------------------------|
130| X          Y          Z          [          \          ]          ↑          ←         |
   |................................) delim.     illeg.     delim.     illeg.     delim.    |
   |----------------------------------------------------------------------------------------|
140| `          a          b          c          d          e          f          g         |
   | donteval   letter (etc.................................................................|
   |----------------------------------------------------------------------------------------|
150| h          i          j          k          l          m          n          o         |
   |........................................................................................|
   |----------------------------------------------------------------------------------------|
160| p          q          r          s          t          u          v          w         |
   |........................................................................................|
   |----------------------------------------------------------------------------------------|
170| x          y          z          {          |          ALT        }          BS        |
   |................................) comment    illeg.     illeg.     endcomment illeg.    |
   |----------------------------------------------------------------------------------------|
APPENDIX 2: Legal REQUIREments.
The following REQUIRE constructs will be recognized and processed
by METALEXAN:
{ <char>, <char1> and <char2> mean "any character". "delim" means a
character whose CHARCLASS value is DELIMITER, either standardly or
because it has been REQUIRED DELIMITER. }
REQUIRE "<char1><char2>" COMMENT_DELIMITERS;
REQUIRE "<char1><char2>" MACROBODY_DELIMITERS;
REQUIRE "<char1><char2>" MACRO_DELIMITERS;
REQUIRE "<delim><char2>" DOUBLE_DELIMITER;
REQUIRE "<char>" ILLEGAL;
REQUIRE "<char>" LETTER;
REQUIRE "<char>" DELIMITER;
REQUIRE "<char>" IGNORED;
REQUIRE "<char>" DONTEVAL;
REQUIRE "<char>" OCTAL;
REQUIRE "<source_file>" SOURCE_FILE;
REQUIRE "<identifier>" NEW_TYPE;
APPENDIX 3: Reserved words.
INITSCAN preloads the following identifiers into the symbol
table, filling the fields as indicated (the list below was produced
by PRINTABLE(1) called after INITSCAN). The ones beginning with "T"
(names of types) are only inserted if the third argument of INITSCAN
is TRUE.
NAME                  RTYPE       VAL   LINK(UNKNOWN)  HASH
IGNORED               144         0                    12
ILLEGAL               141         0                    12
REQUIRE               130         0                    21
COMMENT_DELIMITERS    161         0                    27
COMMENT               128         0                    28
SOURCE_FILE           181         0                    29
TCOMPLEX              -31(TTYPE)  -26                  32
DOUBLE_DELIMITER      147         0                    34
DELIMITER             143         0                    34
LETTER                142         0                    34
TINTEGER              -31         -2                   34
TINTVAR               -31         -22                  34
TREALVAR              -31         -23                  34
TSTRINGVAR            -31         -24                  34
TVECTOR               -31         -30                  34
TTRANS                -31         -28                  35
TCLASS                -31         -32                  35
DONTEVAL              145         0                    36
TREAL                 -31         -3                   36
TMACBODYSTART         -31         -5                   36
TCOMMENT              -31         -10                  36
TENDCOMMENT           -31         -12                  36
TNONDECLARED          -31         -21                  36
DEBUG_MODE            183         0                    37
DEFINE                129         0                    37
TSANSTYPE             -31         0                    37
TENDFILE              -31         -6                   37
TDEFINE               -31         -11                  37
TPROCEDURE            -31         -25                  37
TFRAME                -31         -27                  37
TPLANE                -31         -29                  37
TTYPE                 -31         -31                  37
TDOUBLEDELIM          -31         -33                  37
TNOTHING              -31         -1                   39
TSTRING               -31         -4                   39
TMACRO                -31         -13                  39
MACRO_DELIMITERS      163         0                    43
MACROBODY_DELIMITERS  162         0                    43
NEW_TYPE              182         0                    53
OCTAL                 146         0                    60
(Note: This list is somewhat embarrassing for one who has
claimed that his hash function was a good one. The distribution on
the big actual SAIL program tested was much nicer; my reserved words
have an unpleasant tendency to all begin and end with the same
characters).